The research question for this report was to determine the question that Electric vehicles reduce direct emissions, but does the electricity used to charge them actually come from clean sources? Therefore, one of the sub-questions would be to determine how the share of renewable energy has changed from 2021 to 2023. the other question would be to determine the proportion of electricty used in each state comes from clean sources between 2021 and 2023. the other sub-question would be to determine whethe the states with a higher share of renewable electricity are also the ones with more EV registrations in 2023.Finally the other sub-question would be to identify how much renewable electricity is available oer registered EV in each state.
The following datasets were used to carry out the needed analysis;
1. renew_use_2021.csv,
2. renew_use_2022.csv,
3. renew_use_2023.csv 2021–2023
4. total_energy_use_2021.csv,
5. total_energy_use_2022.csv,
6. total_energy_use_2023.csv
7. ev-registrationsby_state_2023.csv
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.4.3
## Warning: package 'purrr' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
## Warning: package 'janitor' was built under R version 4.4.3
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
library(sf)
## Warning: package 'sf' was built under R version 4.4.3
## Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE
library(viridis)
## Loading required package: viridisLite
library(readxl)
data_path <- "C:/Users/ThinkPad/Desktop/Project 4/"
# Load and clean renewable and total energy data
renew21 <- read_csv(paste0(data_path, "renew-use-2021.csv")) %>% clean_names() %>% mutate(year = 2021)
## Rows: 260 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): State, Energy_Source, Renewable_Use_2021
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
renew22 <- read_csv(paste0(data_path, "renew-use-2022.csv")) %>% clean_names() %>% mutate(year = 2022)
## Rows: 260 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): State, Energy_Source, Renewable_Use_2022
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
renew23 <- read_csv(paste0(data_path, "renew-use-2023.csv")) %>% clean_names() %>% mutate(year = 2023)
## Rows: 260 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): State, Energy_Source, Renewable_Use_2023
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
total21 <- read_csv(paste0(data_path, "total-use-2021.csv")) %>% clean_names() %>% mutate(year = 2021)
## Rows: 52 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Energy_Source
## dbl (5): Coal, Natural Gas†, Petroleum (BTU), nuclear, total_renewable_energy
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
total22 <- read_csv(paste0(data_path, "total-use-2022.csv")) %>% clean_names() %>% mutate(year = 2022)
## Rows: 52 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Energy_Source
## dbl (5): coal Consumption, Natural-Gas, petroleum (btu), Nuclear Energy†, to...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
total23 <- read_csv(paste0(data_path, "total-use-2023.csv")) %>% clean_names() %>% mutate(year = 2023)
## Rows: 52 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Energy_Source
## dbl (5): coal_usage, NaturalGas, petroleum (BTU), nuclear-energy †, total re...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ev_file <- "C:/Users/ThinkPad/Desktop/Project 4/ev-registrations-by-state-2023.csv"
ev23 <- read_csv(ev_file) %>%
clean_names()
## Rows: 52 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): STATE, Count-EVs
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ev23 <- ev23 %>%
mutate(state = str_to_title(state))
head(ev23)
# average price from 2021 to 2023
price_file <- "C:/Users/ThinkPad/Desktop/Project 4/av-energy-price-2021-2023.xlsx"
av_price <- read_excel(price_file) %>%
clean_names()
av_price <- av_price %>%
mutate(state = str_to_title(state))
head(av_price)
# we create an average price for the three years as;
av_price <- av_price %>%
mutate(
avg_price = rowMeans(select(., av_price_2021, av_price_2022, av_price_2023), na.rm = TRUE)
) %>%
select(state, avg_price)
head(av_price)
## cleaning
ev23 <- ev23 %>%
mutate(
count_e_vs = str_replace_all(count_e_vs, "[^0-9]", ""),
count_e_vs = as.numeric(count_e_vs)
)
head(ev23)
## cleaning renewable energy data
## The corrected data
Project4 <- read_csv("C:/Users/ThinkPad/Desktop/Project 4/Project4.csv")%>%
clean_names()
## Rows: 51 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): State_Symbol, State, av_price_2021, av_price_2022, av_price_2023
## dbl (7): Renewable_Use_2021, Renewable_Use_2022, Renewable_Use_2023, total_r...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(Project4)
## Rows: 51
## Columns: 12
## $ state_symbol <chr> "AK", "AL", "AR", "AZ", "CA", "CO", "CT",…
## $ state <chr> "Alaska", "Alabama", "Arkansas", "Arizona…
## $ renewable_use_2021 <dbl> 3153.00, 13150.00, 29635.33, 19853.20, 33…
## $ renewable_use_2022 <dbl> 2169.00, 9525.75, 18164.80, 15030.50, 260…
## $ renewable_use_2023 <dbl> 1671.00, 8287.25, 21617.25, 21689.00, 213…
## $ total_renewable_energy_2021 <dbl> 9597, 239817, 89714, 99266, 810020, 10395…
## $ total_renewables_energy_2022 <dbl> 10410, 232035, 90825, 101215, 880995, 114…
## $ total_renewable_energy_2023 <dbl> 10087, 222189, 87277, 108445, 1065179, 11…
## $ av_price_2021 <chr> "20.03", "17.85", "$18.42", "25.07", "$28…
## $ av_price_2022 <chr> "$27.33", "23.37", "$23.84", "31.72", "$3…
## $ av_price_2023 <chr> "$23.84", "21.11", "$21.76", "30.28", "$3…
## $ e_vs <dbl> 13047, 2697, 89798, 7108, 1256646, 90083,…
summary(select(Project4,renewable_use_2021, renewable_use_2022,renewable_use_2023,total_renewable_energy_2021,total_renewables_energy_2022, total_renewable_energy_2023, av_price_2021,av_price_2022,av_price_2023, e_vs))
## renewable_use_2021 renewable_use_2022 renewable_use_2023
## Min. : 479.8 Min. : 524.6 Min. : 559.2
## 1st Qu.: 10485.2 1st Qu.: 6976.8 1st Qu.: 11218.0
## Median : 19028.4 Median : 15503.2 Median : 21689.0
## Mean : 32175.9 Mean : 29731.2 Mean : 30513.5
## 3rd Qu.: 37405.7 3rd Qu.: 33952.8 3rd Qu.: 36935.7
## Max. :334025.0 Max. :260168.3 Max. :213035.8
## total_renewable_energy_2021 total_renewables_energy_2022
## Min. : 2487 Min. : 2622
## 1st Qu.: 62928 1st Qu.: 69674
## Median : 127382 Median : 116472
## Mean : 298945 Mean : 316703
## 3rd Qu.: 195524 3rd Qu.: 202488
## Max. :7646167 Max. :8107353
## total_renewable_energy_2023 av_price_2021 av_price_2022
## Min. : 2795 Length:51 Length:51
## 1st Qu.: 69954 Class :character Class :character
## Median : 115678 Mode :character Mode :character
## Mean : 319545
## 3rd Qu.: 192632
## Max. :8187317
## av_price_2023 e_vs
## Length:51 Min. : 959
## Class :character 1st Qu.: 8108
## Mode :character Median : 25565
## Mean : 69911
## 3rd Qu.: 71152
## Max. :1256646
project_long <- Project4 %>%
pivot_longer(
cols = matches("renewable_use_|total_.*_202\\d|av_price_"),
names_to = c(".value", "year"),
names_pattern = "(.*)_(202\\d)"
)
## This code cleans and converts variables
project_long <- project_long %>%
mutate(
year = as.integer(year),
renewable_use = as.numeric(str_replace_all(renewable_use, "[^0-9\\.]", "")),
total_renewable_energy = as.numeric(str_replace_all(total_renewable_energy, "[^0-9\\.]", "")),
av_price = as.numeric(str_replace_all(av_price, "[$]", "")),
e_vs = as.numeric(e_vs) # EV column
) %>%
select(state_symbol, state, year, renewable_use, total_renewable_energy, av_price, e_vs)
## Thi code creats the required variables for analysis
project_long <- project_long %>%
mutate(
pct_renewable = (renewable_use / total_renewable_energy) * 100,
ev_per_total_energy = e_vs / total_renewable_energy
)
head(project_long)
library(kableExtra)
## Warning: package 'kableExtra' was built under R version 4.4.3
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
summary_table <- project_long %>%
group_by(year) %>%
summarise(
Mean_Renewable_Use = round(mean(renewable_use, na.rm = TRUE), 2),
SD_Renewable_Use = round(sd(renewable_use, na.rm = TRUE), 2),
Mean_Total_Renewable_Energy = round(mean(total_renewable_energy, na.rm = TRUE), 2),
Mean_Pct_Renewable = round(mean(pct_renewable, na.rm = TRUE), 2),
Mean_EV_Ratio = round(mean(ev_per_total_energy, na.rm = TRUE), 4),
Mean_Price = round(mean(av_price, na.rm = TRUE), 2),
States_Observed = n()
)
kable(summary_table, caption = "Summary Statistics of Energy and EV Indicators by Year")
## Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
## Use 'xfun::attr2()' instead.
## See help("Deprecated")
## Warning in attr(x, "format"): 'xfun::attr()' is deprecated.
## Use 'xfun::attr2()' instead.
## See help("Deprecated")
| year | Mean_Renewable_Use | SD_Renewable_Use | Mean_Total_Renewable_Energy | Mean_Pct_Renewable | Mean_EV_Ratio | Mean_Price | States_Observed |
|---|---|---|---|---|---|---|---|
| 2021 | 32175.94 | 49455.22 | 298944.7 | 24.96 | 0.5922 | 20.71 | 51 |
| 2022 | 29731.17 | 44976.84 | NaN | NaN | NaN | 26.58 | 51 |
| 2023 | 30513.46 | 36219.08 | 319544.5 | 23.51 | 0.5530 | 24.78 | 51 |
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.4.3
library(tigris)
## Warning: package 'tigris' was built under R version 4.4.3
## To enable caching of data, set `options(tigris_use_cache = TRUE)`
## in your R script or .Rprofile.
# To create maps, we will need to download US shapefiles as shown;
states_sf <- tigris::states(cb = TRUE) %>%
st_as_sf() %>%
filter(!STUSPS %in% c("AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA",
"KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ",
"NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT",
"VA","WA","WV","WI","WY","DC"))
## Retrieving data for the year 2024
## | | | 0% | |= | 1% | |= | 2% | |== | 3% | |== | 4% | |=== | 4% | |=== | 5% | |==== | 5% | |===== | 7% | |====== | 8% | |====== | 9% | |======= | 10% | |======== | 11% | |======== | 12% | |========= | 12% | |========= | 13% | |========== | 14% | |========== | 15% | |============ | 17% | |============ | 18% | |============= | 19% | |============== | 20% | |============== | 21% | |=============== | 21% | |=============== | 22% | |================ | 23% | |================= | 24% | |================= | 25% | |================== | 25% | |================== | 26% | |=================== | 27% | |=================== | 28% | |==================== | 29% | |===================== | 30% | |===================== | 31% | |====================== | 31% | |====================== | 32% | |======================= | 32% | |======================= | 33% | |======================== | 34% | |======================== | 35% | |========================= | 35% | |========================= | 36% | |========================== | 37% | |=========================== | 39% | |============================= | 41% | |============================= | 42% | |============================== | 43% | |=============================== | 45% | |================================ | 45% | |================================= | 47% | |================================== | 48% | |================================== | 49% | |=================================== | 50% | |==================================== | 51% | |==================================== | 52% | |===================================== | 53% | |====================================== | 54% | |====================================== | 55% | |======================================== | 57% | |========================================= | 58% | |========================================= | 59% | |========================================== | 59% | |========================================== | 60% | |=========================================== | 61% | |============================================ | 62% | |============================================ | 63% | |============================================ | 64% | |============================================= | 65% | |============================================== | 65% | |============================================== | 66% | |=============================================== | 67% | |=============================================== | 68% | |================================================ | 68% | |================================================ | 69% | |================================================= | 70% | |================================================== | 71% | |================================================== | 72% | |=================================================== | 73% | |==================================================== | 74% | |==================================================== | 75% | |===================================================== | 75% | |===================================================== | 76% | |====================================================== | 77% | |======================================================= | 79% | |======================================================== | 80% | |======================================================== | 81% | |========================================================= | 81% | |========================================================== | 83% | |=========================================================== | 85% | |============================================================ | 85% | |============================================================ | 86% | |============================================================= | 87% | |============================================================== | 89% | |=============================================================== | 90% | |================================================================ | 91% | |================================================================= | 93% | |================================================================== | 95% | |=================================================================== | 95% | |=================================================================== | 96% | |==================================================================== | 97% | |===================================================================== | 99% | |======================================================================| 99% | |======================================================================| 100%
project_long <- project_long %>%
filter(year %in% c(2021, 2022, 2023))
# This code merges geometry with renewable per EV for 2023
energy_2023 <- project_long %>%
filter(year == 2023) %>%
mutate(state_name = str_to_title(state))
# This code will help in merging the desired states by state name
states_merged <- states_sf %>%
left_join(energy_2023, by = c("NAME" = "state_name"))
## This code creates a map using ggplot
library(tigris)
library(sf)
# Download state boundaries (50 states + DC)
states_sf <- states(cb = TRUE)
## Retrieving data for the year 2024
states_sf <- states_sf[!states_sf$NAME %in% c("Puerto Rico", "Guam",
"American Samoa",
"Commonwealth of the Northern Mariana Islands",
"United States Virgin Islands"), ]
states_merged <- left_join(states_sf, project_long, by = c("NAME" = "state"))
energy_2023 <- project_long %>%
filter(year == 2023)
states_merged <- left_join(states_sf, energy_2023, by = c("NAME" = "state"))
ggplot(states_merged) +
geom_sf(aes(fill = pct_renewable), color = "white") +
scale_fill_viridis_c(option = "plasma", na.value = "grey90") +
labs(
title = "Percentage of Renewable Energy by State (2023)",
fill = "% Renewable",
caption = "Data source: U.S. State Renewable Energy Dataset created by Sievers"
) +
theme_minimal()
# Filter for 2023
energy_2023 <- project_long %>%
filter(year == 2023) %>%
mutate(
EVs = as.numeric(e_vs),
total_renewable_2023 = as.numeric(total_renewable_energy),
renewable_per_ev = total_renewable_2023 / EVs
)
summary(energy_2023$renewable_per_ev)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.289 1.168 3.482 155.837 11.750 7188.162
states_sf <- states(cb = TRUE) %>% filter(!STUSPS %in% c("GU","AS","PR","VI","MP"))
## Retrieving data for the year 2024
states_merged <- states_sf %>%
left_join(energy_2023, by = c("NAME" = "state"))
## This code uses Leaflet package to create an interactive plot
pal <- colorNumeric("YlGnBu", domain = states_merged$pct_renewable)
leaflet(states_merged) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(
fillColor = ~pal(pct_renewable),
weight = 1,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.8,
label = ~paste0(NAME, ": ", round(pct_renewable, 1), "% Renewable"),
highlight = highlightOptions(
weight = 2,
color = "#666",
fillOpacity = 0.9,
bringToFront = TRUE
)
) %>%
addLegend(
pal = pal,
values = ~pct_renewable,
opacity = 0.7,
title = "Percent Renewable",
position = "bottomright"
)
## Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
## Need '+proj=longlat +datum=WGS84'
library(gganimate)
## Warning: package 'gganimate' was built under R version 4.4.3
## No renderer backend detected. gganimate will default to writing frames to separate files
## Consider installing:
## - the `gifski` package for gif output
## - the `av` package for video output
## and restarting the R session
To answer the sub-question that aims to determine how much renewable energy is available per regiistered EV, we will proceed as follows;
# first, we filter data to only have 2023 because EV data did not have 2021 and 2022 data.
states_sf <- tigris::states(cb = TRUE) %>%
st_as_sf() %>%
filter(!STUSPS %in% c("GU", "AS", "PR", "VI", "MP"))
## Retrieving data for the year 2024
states_merged <- states_sf %>%
left_join(energy_2023, by = c("NAME" = "state"))
energy_2023 <- project_long %>%
filter(year == 2023) %>%
mutate(
state_name = str_to_title(state),
renewable_per_ev = total_renewable_energy / e_vs
)
# This code merges with states Shapefiles
states_merged <- states_sf %>%
left_join(energy_2023, by = c("NAME" = "state_name"))
# This code creates a color pallette
pal_ev <- colorNumeric(
palette = "YlOrRd",
domain = states_merged$renewable_per_ev,
na.color = "grey90"
)
leaflet(states_merged) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(
fillColor = ~pal_ev(renewable_per_ev),
weight = 1,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.8,
label = ~paste0(NAME, ": ", round(renewable_per_ev, 1), " units per EV"),
highlight = highlightOptions(
weight = 2,
color = "#666",
fillOpacity = 0.9,
bringToFront = TRUE
)
) %>%
addLegend(
pal = pal_ev,
values = ~renewable_per_ev,
opacity = 0.7,
title = "Renewable Energy per EV",
position = "bottomright"
)
## Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
## Need '+proj=longlat +datum=WGS84'
project_long <- Project4 %>%
pivot_longer(
cols = c(renewable_use_2021:renewable_use_2023,
total_renewable_energy_2021:total_renewable_energy_2023,
av_price_2021:av_price_2023),
names_to = c(".value", "year"),
names_pattern = "(.*)_(\\d+)"
) %>%
mutate(
year = as.numeric(year),
e_vs = as.numeric(e_vs), # ensure EVs are numeric
pct_renewable = (renewable_use / total_renewable_energy) * 100,
ev_per_total_energy = total_renewable_energy / e_vs
)
## LOAD SHAPEFILES AND INCLUDE ALL 50 STATES
states_sf <- tigris::states(cb = TRUE) %>%
st_as_sf() %>%
filter(STUSPS %in% c(
"AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA",
"KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ",
"NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT",
"VA","WA","WV","WI","WY","DC"
))
## Retrieving data for the year 2024
ggplot(states_merged) +
geom_sf(aes(fill = total_renewable_energy)) +
scale_fill_viridis_c(option = "plasma", na.value = "grey90") +
facet_wrap(~year) + # one map per year
labs(
title = "Total Renewable Energy by State (2021–2023)",
fill = "Renewable Energy",
caption = "Data source: Project 4"
) +
theme_minimal()
To start with, the maps showed variations in renewable energy across different states. States like Wyoming, Massachusetts among others had high renewable energies compared to states like Arkansas, Nevada, New Mexico and even California. Consequently, there is a gradual growth in renewable energy indicating efforts to expand the capacity of Clean energy.
Additionally, the results indicates that North eastern states have higher total renewable energy compared to other states.
These results implies that EVs are more environmentally beneficial in states with higher renewable energy shares and therefore, charging in states with low renewable energy may largely rely on fossil fuels.